From Syntactical Analysis to Textual Segmentation
Abstract
This work describes a proposal for automatic textual segmentation. The proposal takes the output of an automatic syntactic analyzer, Parser Palavras, and uses it to segment text. Parse trees are used to infer both the text segments and a dependency tree over the identified segments. The main contributions are the use of syntactic structure as the source for automatic text segmentation and the use of inference rules for textual organization.
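The general idea of reading segments and a segment dependency tree off a parse tree can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Node` class, the `"CL"` clause label, and the single rule "every clause node opens a new segment attached to its enclosing segment" are hypothetical. They are not the output format of Parser Palavras, nor the inference rules of the proposal, which the abstract does not specify.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Toy parse-tree node (hypothetical format, not the parser's real output)."""
    label: str                       # constituent label, e.g. "CL" for a clause
    word: Optional[str] = None       # set only on leaf nodes
    children: List["Node"] = field(default_factory=list)


CLAUSE_LABELS = {"CL"}  # assumption: nodes with these labels start a new segment


def segment(root: Node) -> Tuple[List[str], List[Optional[int]]]:
    """Return (segments, parents).

    segments[i] is the text of segment i; parents[i] is the index of the
    segment that encloses it, or None for the top-level segment.
    """
    words: List[List[str]] = []
    parents: List[Optional[int]] = []

    def visit(node: Node, current: Optional[int]) -> None:
        # Open a new segment at the root and at every clause node.
        if current is None or node.label in CLAUSE_LABELS:
            words.append([])
            parents.append(current)
            current = len(words) - 1
        if node.word is not None:
            words[current].append(node.word)
        for child in node.children:
            visit(child, current)

    visit(root, None)
    return [" ".join(w) for w in words], parents


# Toy tree for "The parser failed because the input was empty"
tree = Node("CL", children=[
    Node("NP", children=[Node("W", "The"), Node("W", "parser")]),
    Node("W", "failed"),
    Node("CL", children=[
        Node("W", "because"), Node("W", "the"), Node("W", "input"),
        Node("W", "was"), Node("W", "empty"),
    ]),
])

segments, parents = segment(tree)
print(segments)  # ['The parser failed', 'because the input was empty']
print(parents)   # [None, 0]
```

In this sketch the `parents` list plays the role of the segment dependency tree: the subordinate clause depends on the main clause. A real rule set would need to handle more constituent types and clauses embedded in the middle of another clause.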
Similar resources
Segmentation-Driven Recognition Applied to Numerical Field Extraction from Handwritten Incoming Mail Documents
In this paper, we present a method for the automatic extraction of numerical fields (zip codes, phone numbers, etc.) from incoming mail documents. The approach is based on a segmentation-driven recognition that aims at locating isolated and touching digits among the textual information. A syntactical analysis is then performed on each line of text in order to filter the sequences that respect a...
Topic segmentation via community detection in complex networks
Many real systems have been modeled in terms of network concepts, and written texts are a particular example of information networks. In recent years, the use of network methods to analyze language has allowed the discovery of several interesting effects, including the proposition of novel models to explain the emergence of fundamental universal patterns. While syntactical networks, one of the ...
Image Retrieval for Information Systems
In order to retrieve a set of intended images from a huge image archive, human beings think of special contents with respect to the searched scene, like a countryside or a technical drawing. Therefore, in general it is harder to retrieve images by using a syntactical feature based language than a language which offers the selection of examples concerning colour, texture, and contour in combinati...
Recognizing Textual Entailment Using Lexical, Syntactical, and Semantic Information
This paper describes our system for participating in the system validation subtask of NTCIR-11 RITE-VAL. We trained an SVM model with LibSVM using features extracted from labeled sentence pairs. Besides features based on lexical, syntactic and semantic analysis, we introduce a novel approach of extracting “concepts” from a sentence and generating features based on it. Unlabeled testing sentence ...
Discrimination Between Digits and Outliers in Handwritten Documents Applied to the Extraction of Numerical Fields
In this article, we propose a numerical field extraction system from unconstrained handwritten documents. The system is based on a segmentation driven by recognition stage followed by a syntactical analysis which detects the sequences that may compose a numerical field. We focus here on the design of a digit classifier embedded in the segmentation/recognition process able to discriminate digits...